SUMMARY

The "problem of m rankings", so named by Kendall and studied
extensively by Kendall and Babington Smith (1939), Kendall (1970), and
others, considers the relationship between the rankings that a group
of m judges assigns to a set of k objects. Suppose there are two
groups of judges ranking the objects. Given that there is agreement
within each group of judges, how can we test for evidence of agreement
between the two groups? This question, recently posed to us by Kendall,
has been studied by Schucany and Frawley (1973) and Li and Schucany (1975).
In this paper we show that the test of agreement proposed by Schucany
and Frawley, and further advanced by Li and Schucany, is misleading
and does not provide a satisfactory answer to Kendall's question.

After pinpointing various defects of the Schucany-Frawley test, we
adapt a procedure, proposed by Wald and Wolfowitz (1944) in a slightly
different context, to furnish a new test for agreement between two
groups of judges.


Some  key  words:  Conditionally  distribution-free;  Permutation  test; 


1. INTRODUCTION

Suppose that a judge is presented with $k$ objects, say $k$ science
fair projects, and is asked to rank them. Then his ranking is a vector
$r = (r_1, \ldots, r_k)$ chosen according to his probability distribution
of rankings, $Q$, on the space $\Omega$ of $k!$ possible rankings. When this
probability distribution is the uniform probability distribution $U$
($U$ assigns probability $1/k!$ to each ranking), we say that the judge
has no opinion. Otherwise, we say that the judge has an opinion which
is quantified by $Q$.

Suppose that there are $m$ like-minded male judges who rank the
$k$ objects independently, producing the rankings $r_i = (r_{i1}, \ldots, r_{ik})$,
$i = 1, \ldots, m$. That is, we assume $r_1, \ldots, r_m$ are independent and
identically distributed random vectors in $\Omega$ with a common distribution
$Q_1$, the opinion of the male judges. Next suppose that there
is a second group of $n$ like-minded female judges who rank the same
$k$ objects independently and produce the rankings $r_i = (r_{i1}, \ldots, r_{ik})$,
$i = m + 1, \ldots, N$, where $N = m + n$. That is, we assume $r_{m+1}, \ldots, r_N$
are independent and identically distributed random vectors in $\Omega$ with
a common distribution $Q_2$, the opinion of the female judges. How do we
test that the male and female judges have a common opinion?

Sir Maurice Kendall posed this question to one of us during his
visit to Tallahassee in the Spring of 1976. In our search of the
literature, we discovered that Schucany and Frawley (1973) have proposed
a test intended to solve this problem. The Schucany-Frawley (SF)
test, further advanced by Li and Schucany (1975) and generalized by
Beckett (1975) and Beckett and Schucany (1975), is based on the statistic
$L$ defined by (1.2) below.

Let

$$ S_j = \sum_{i=1}^{m} r_{ij}, \qquad T_j = \sum_{i=m+1}^{N} r_{ij}, \qquad j = 1, \ldots, k. \tag{1.1} $$

The SF statistic is

$$ L = \sum_{j=1}^{k} S_j T_j. \tag{1.2} $$


It is easily seen that $L$ is equivalent to the statistic $\bar{\rho}$, the
average value of all $mn$ Spearman rank order correlations of a ranking
from a male judge with a ranking from a female judge. More precisely,

$$ \bar{\rho} = \{12L - 3mnk(k+1)^2\}/\{mn(k^3-k)\}, \tag{1.3} $$

where

$$ \bar{\rho} = (mn)^{-1} \sum_{i=1}^{m} \sum_{i'=m+1}^{N} \rho_{ii'}, \tag{1.4} $$

and

$$ \rho_{ii'} = 1 - \Big[\Big\{6 \sum_{j=1}^{k} (r_{ij} - r_{i'j})^2\Big\}/\{k^3-k\}\Big]. \tag{1.5} $$


Schucany and Frawley reasoned that large values of $\bar{\rho}$, or equivalently
large values of $L$, should constitute evidence for the hypothesis $H_{11}$
of two-group agreement. ($H_{11}$ is defined precisely by (2.4) of Section 2.)
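The equivalence between $L$ and the average Spearman correlation can be checked numerically. The following is a minimal sketch, not part of the original development; the rankings of $k = 4$ objects are hypothetical values chosen only for illustration, and the function names are ours.

```python
from fractions import Fraction

def sf_statistic(males, females):
    """Return (S, T, L) for two lists of rank vectors, each a permutation of 1..k."""
    k = len(males[0])
    S = [sum(r[j] for r in males) for j in range(k)]     # male column sums, (1.1)
    T = [sum(r[j] for r in females) for j in range(k)]   # female column sums, (1.1)
    L = sum(s * t for s, t in zip(S, T))                 # SF statistic, (1.2)
    return S, T, L

def spearman(r1, r2):
    """Spearman rank correlation of two rankings without ties, (1.5)."""
    k = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - Fraction(6 * d2, k ** 3 - k)

def mean_spearman(males, females):
    """Average of all mn male-female Spearman correlations, (1.4)."""
    total = sum(spearman(a, b) for a in males for b in females)
    return total / (len(males) * len(females))

# Hypothetical rankings (illustrative only, not data from the paper).
males = [(1, 2, 3, 4), (2, 1, 3, 4)]
females = [(1, 3, 2, 4), (4, 3, 2, 1), (2, 1, 4, 3)]

S, T, L = sf_statistic(males, females)
m, n, k = len(males), len(females), len(males[0])
rho_from_L = Fraction(12 * L - 3 * m * n * k * (k + 1) ** 2, m * n * (k ** 3 - k))  # (1.3)
rho_bar = mean_spearman(males, females)
print(L, rho_bar, rho_from_L)
```

Exact rational arithmetic is used so that the two sides of (1.3) can be compared for equality rather than to within rounding error.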

In Section 2 of this paper we show that the SF test is misleading,
and does not constitute a satisfactory answer to Kendall's question.
The defects of the SF test include:

I. When $m < n$, the statistic $L$ gives too much weight to the rankings
of male judges, and not enough weight to the rankings of female
judges. When $m > n$, the situation is reversed.

II. Critical values for the SF test are obtained by referring $L$
to its distribution under an irrelevant (for the problem under
discussion) hypothesis of complete accordance within each
group. The hypothesis $H_{00}$ [see (2.1) of Section 2] specifies that
$Q_1 = Q_2 = U$.

III. In Section 2, equation (2.2) defines the alternative $H_{01}$ which
specifies that the male judges have no opinion ($Q_1 = U$) but the
female judges have an opinion ($Q_2 \ne U$). The alternative $H_{10}$ is
defined by (2.3) analogously. Then, in Theorem 1 and Corollary
2 of Section 2, we prove that $L$ has the same distribution under
$H_{00}$ as it does under $H_{01} \cup H_{10}$. Thus the SF test cannot
discriminate between $H_{00}$, where the two groups of judges are governed
by the same uniform distribution, and $H_{01} \cup H_{10}$, where the two
groups of judges are governed by different distributions, one of
which is uniform.

IV. The SF test is not consistent against a large class of alternatives
where the two groups of judges have different opinions. That is,
there are $(Q_1, Q_2)$ pairs in $A_{11}$ (defined by (2.5) of Section 2)
where neither $Q_1$ nor $Q_2$ is uniform, but for which, even
as $m$ and $n$ get arbitrarily large, the SF test leads to the decision
that the two groups agree.
In Section 3 we show that we can apply a permutation test based
on the Mahalanobis $D^2$-statistic, proposed in a different setting by
Wald and Wolfowitz (1944), to obtain a conditionally distribution-free
test for the hypothesis of agreement between the two groups of judges.
A convenient large sample approximation is available, and the test is
consistent for a large class of alternatives.

Section 4 contains an application of our conditional test, and the
SF test, to a set of leisure activity preferences data provided by
Sutton (1976).

2. THE SCHUCANY-FRAWLEY TEST

To understand the contents of the Schucany-Frawley (1973) paper,
and the SF test advocated there and in the subsequent paper by Li and
Schucany (1975), it is helpful to consider the following five subclasses
of possible opinions $(Q_1, Q_2)$ for the two groups of judges. Thus, let

$$ H_{00} = \{(Q_1, Q_2): Q_1 = Q_2 = U\}, \tag{2.1} $$

$$ H_{01} = \{(Q_1, Q_2): Q_1 = U,\ Q_2 \ne U\}, \tag{2.2} $$

$$ H_{10} = \{(Q_1, Q_2): Q_1 \ne U,\ Q_2 = U\}, \tag{2.3} $$

$$ H_{11} = \{(Q_1, Q_2): Q_1 = Q_2,\ Q_1 \ne U,\ Q_2 \ne U\}, \tag{2.4} $$

and

$$ A_{11} = \{(Q_1, Q_2): Q_1 \ne Q_2,\ Q_1 \ne U,\ Q_2 \ne U\}. \tag{2.5} $$
The hypothesis of agreement between the two groups of judges
corresponds to $H = H_{00} \cup H_{11}$. However, the hypothesis of agreement,
given that each group of judges has an opinion, corresponds to $H_{11}$,
and the hypothesis that the judges have no opinion (in Kendall's
terminology, the hypothesis of complete accordance) corresponds to $H_{00}$.

Schucany and Frawley (1973) state that "... it is meaningless to
make any comparison between groups unless each group 'has an opinion'
i.e., there is concordance within each group." They then incongruously
designate $H_{00}$ as the "null hypothesis." At the $\alpha$ level they propose
to reject $H_{00}$ in favor of $H_{11}$ when $L \ge l_{00}$, where $l_{00}$ is determined by

$$ P_{H_{00}}(L \ge l_{00}) = \alpha. \tag{2.6} $$


If $m$ and $n$ are large, the normal approximation to the distribution of
$L$ under $H_{00}$ yields

$$ l_{00} = E_{00}(L) + z_{\alpha}\{\mathrm{var}_{00}(L)\}^{1/2}, \tag{2.7} $$

where

$$ E_{00}(L) = mnk(k+1)^2/4, \tag{2.8} $$

$$ \mathrm{var}_{00}(L) = mn(k-1)k^2(k+1)^2/144, \tag{2.9} $$

are the mean and variance, respectively, of $L$ under $H_{00}$ and $z_{\alpha}$ is the
upper $\alpha$ percentile point of the standard normal distribution.
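As an illustration of (2.7) through (2.9), the approximate SF critical value can be computed directly. This is a small sketch under stated assumptions: the group sizes $m = n = 10$, the number of objects $k = 5$, and the level $\alpha = .05$ are hypothetical, and the function names are ours.

```python
from math import sqrt
from statistics import NormalDist

def sf_null_moments(m, n, k):
    """Mean and variance of L when both groups have no opinion."""
    mean = m * n * k * (k + 1) ** 2 / 4                    # (2.8)
    var = m * n * (k - 1) * k ** 2 * (k + 1) ** 2 / 144    # (2.9)
    return mean, var

def sf_critical_value(m, n, k, alpha):
    """Normal-approximation critical value l_00 of the SF test, (2.7)."""
    mean, var = sf_null_moments(m, n, k)
    z_alpha = NormalDist().inv_cdf(1 - alpha)              # upper alpha point
    return mean + z_alpha * sqrt(var)

# Hypothetical sizes, for illustration only.
mean, var = sf_null_moments(10, 10, 5)
crit = sf_critical_value(10, 10, 5, alpha=0.05)
print(mean, var, crit)
```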
for example, an extreme case where $m = 1$ and $n = 10$. Then, as summarized
by the $L$ statistic, or equivalently $\bar{\rho}$, a direct averaging of the 10 rank
correlation coefficients gives too much weight to the rank vector of the
male judge.

Secondly, the test is defined by Schucany and Frawley to discriminate
between $H_{00}$ and $H_{11}$ when in fact they state it is meaningless to compare
the groups unless each group has an opinion. The hypothesis $H_{00}$ asserts
that each group does not have an opinion.

Thirdly, we now show (Theorem 1 and Corollary 2) that the distribution
of $L$ under $H_{00}$ is the same as the distribution of $L$ under any
$(Q_1, Q_2)$ in $H_{01} \cup H_{10}$. Thus, in contrast to its designed intention, the
SF test actually can only discriminate between $H_{11}$ and $H_{00} \cup H_{01} \cup H_{10}$,
and the latter hypothesis includes cases where the two groups of judges
agree and cases where the two groups of judges disagree.

Theorem 1 shows that when one group of judges has no opinion, the
distribution of a general class of statistics, including $L$, does not
depend on the opinion of the second group of judges. We call Theorem 1
the indistinguishability theorem.

THEOREM 1. Let $g(s_1, \ldots, s_k; t_1, \ldots, t_k)$ be a function of $2k$
arguments with an invariance property given by

$$ g(s_1, \ldots, s_k; t_1, \ldots, t_k) = g(s_{p_1}, \ldots, s_{p_k}; t_{p_1}, \ldots, t_{p_k}) \tag{2.10} $$

for each permutation $(p_1, \ldots, p_k)$ of $(1, \ldots, k)$. Then the statistic
$g(r_{11}, \ldots, r_{1k}; r_{m+1,1}, \ldots, r_{m+1,k})$ is distribution-free under
$H_{00} \cup H_{01} \cup H_{10}$.


Proof. Let $(Q_1, Q_2) \in H_{10}$. Then $Q_2 = U$. Define the random permutation
$(p_1, \ldots, p_k)$, depending on $(r_{11}, \ldots, r_{1k})$ only, by

$$ r_{1,p_j} = j, \qquad j = 1, \ldots, k. \tag{2.11} $$

Using the invariance property (2.10) we have

$$ P_{Q_1,U}\{g(r_{11}, \ldots, r_{1k}; r_{m+1,1}, \ldots, r_{m+1,k}) = g_0\} $$
$$ = P_{Q_1,U}\{g(1, \ldots, k; r_{m+1,p_1}, \ldots, r_{m+1,p_k}) = g_0\} \tag{2.12} $$
$$ = P_{U}\{g(1, \ldots, k; r_{m+1,1}, \ldots, r_{m+1,k}) = g_0\}. $$

The last equality above follows since $(p_1, \ldots, p_k)$ is independent of
$(r_{m+1,1}, \ldots, r_{m+1,k})$ and the distribution of $(r_{m+1,1}, \ldots, r_{m+1,k})$ is
permutation invariant. This proves that the distribution of
$g(r_{11}, \ldots, r_{1k}; r_{m+1,1}, \ldots, r_{m+1,k})$ under $H_{10}$ is the same as under
$H_{00}$. The same argument shows that the distribution of
$g(r_{11}, \ldots, r_{1k}; r_{m+1,1}, \ldots, r_{m+1,k})$ under $H_{01}$ is the same as under
$H_{00}$. This completes the proof.
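The statement of Theorem 1 can be checked by exact enumeration in a small case. The sketch below covers only the single-pair setting of the theorem (one male ranking and one female ranking, $k = 3$), with $g(s; t) = \sum_j s_j t_j$; the non-uniform opinion `Q` is a hypothetical choice made for illustration, and the function names are ours.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

def g(s, t):
    """Pair statistic sum_j s_j t_j; permuting the objects in both arguments leaves it unchanged."""
    return sum(a * b for a, b in zip(s, t))

def law_of_g(Q1, Q2):
    """Exact law of g for one male ranking drawn from Q1 and one female ranking from Q2.

    Q1 and Q2 map each ranking (a tuple) to its probability.
    """
    law = Counter()
    for r_male, p in Q1.items():
        for r_female, q in Q2.items():
            law[g(r_male, r_female)] += p * q
    return dict(law)

k = 3
rankings = list(permutations(range(1, k + 1)))
U = {r: Fraction(1, len(rankings)) for r in rankings}       # no opinion

# Hypothetical non-uniform opinion (weights are illustrative only).
weights = (6, 2, 1, 1, 1, 1)
Q = {r: Fraction(w, sum(weights)) for r, w in zip(rankings, weights)}

law_h00 = law_of_g(U, U)   # both judges have no opinion
law_h10 = law_of_g(Q, U)   # male judge has an opinion, female judge has none
law_h01 = law_of_g(U, Q)   # male judge has no opinion, female judge has one
print(law_h00)
```

In this enumeration the three laws coincide, in line with the theorem for a single male-female pair.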


COROLLARY 2. The statistic $L$ is distribution-free under $H_{00} \cup H_{01} \cup H_{10}$.

Proof. The function

$$ g(s_1, \ldots, s_k; t_1, \ldots, t_k) = \sum_{j=1}^{k} s_j t_j $$

satisfies invariance property (2.10). The proof is completed by noting
that the statistic $L$ is of the form

$$ L(r_1, \ldots, r_N) = \sum_{i=1}^{m} \sum_{j=m+1}^{N} g(r_{i1}, \ldots, r_{ik}; r_{j1}, \ldots, r_{jk}). $$

In addition to the aforementioned defects of the SF test, its
possible usefulness is further seriously weakened by the fact that it
is not consistent against a large class of $(Q_1, Q_2)$ pairs in $A_{11}$.
Define the vectors of mean rankings of the two groups of judges as follows:

$$ \mu = (\mu_1, \ldots, \mu_k), \qquad \nu = (\nu_1, \ldots, \nu_k), \tag{2.13} $$

where

$$ \mu_j = E_{Q_1}(r_{1j}), \qquad \nu_j = E_{Q_2}(r_{m+1,j}), \qquad j = 1, \ldots, k, \tag{2.14} $$

and $E_{Q_1}$, $E_{Q_2}$ denote that the expectation is taken with respect to $Q_1$,
$Q_2$ respectively. Then $(S_1/m, \ldots, S_k/m)$ and $(T_1/n, \ldots, T_k/n)$ are
consistent estimates of $\mu$ and $\nu$, respectively. Thus if $(Q_1, Q_2) \in A_{11}$
is such that

$$ \sum_{j=1}^{k} \mu_j \nu_j - \{k(k+1)^2/4\} < 0, $$

then, under such a $(Q_1, Q_2)$, the statistic $\{L - E_{00}(L)\}/\{\mathrm{var}_{00}(L)\}^{1/2}$
will tend to $-\infty$ and the SF test will not lead to the rejection of the
SF "null hypothesis" $H_{00}$ and thus the hypothesis of complete accordance
will be (erroneously) accepted.
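The drift to $-\infty$ can be made concrete. Since the two groups are independent, $E(L) = mn \sum_j \mu_j \nu_j$, so the standardized statistic is centred near $mn\{\sum_j \mu_j \nu_j - k(k+1)^2/4\}/\{\mathrm{var}_{00}(L)\}^{1/2}$. The sketch below evaluates this centre; the mean-rank vectors are hypothetical, chosen so that the two groups hold nearly opposite opinions, and the function name is ours.

```python
from math import sqrt

def sf_standardized_centre(mu, nu, m, n):
    """Centre of {L - E00(L)} / sqrt(var00(L)) when E(L) = m n sum_j mu_j nu_j."""
    k = len(mu)
    gap = sum(a * b for a, b in zip(mu, nu)) - k * (k + 1) ** 2 / 4
    var00 = m * n * (k - 1) * k ** 2 * (k + 1) ** 2 / 144   # (2.9)
    return m * n * gap / sqrt(var00)

# Hypothetical mean rankings for k = 3 (each sums to k(k+1)/2 = 6).
mu = (1.2, 2.0, 2.8)
nu = (2.8, 2.0, 1.2)

centre_small = sf_standardized_centre(mu, nu, m=10, n=10)
centre_large = sf_standardized_centre(mu, nu, m=100, n=100)
print(centre_small, centre_large)
```

The centre is negative and grows in magnitude like $(mn)^{1/2}$, so the SF critical value is never reached even though the two groups plainly disagree.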


3. A CONDITIONALLY DISTRIBUTION-FREE TEST

The basic hypothesis testing problem of "agreement" versus
"disagreement" between the two groups is, in terms of the hypotheses
defined by (2.1) - (2.5), to discriminate between $H = H_{00} \cup H_{11}$ versus
$A = H_{01} \cup H_{10} \cup A_{11}$. Since each judge's rank vector can assume only
$k!$ values, it appears at first glance that a test based on a multinomial
distribution with $k!$ cells could provide a solution to the
testing problem. However, since $k!$ is usually large, and many of
the $k!$ rankings will not occur in the data, such a test based on the
multinomial would not be satisfactory.

We therefore modify the testing problem slightly by restricting
the class of alternatives to those $(Q_1, Q_2)$ pairs whose vectors of
mean ranks for the $k$ objects are unequal. That is, in the notation of
(2.13), we will test the hypothesis

$$ H = \{(Q_1, Q_2): Q_1 = Q_2\}, \tag{3.1} $$

versus the alternative

$$ A^* = \{(Q_1, Q_2): \mu \ne \nu\}. \tag{3.2} $$

We have thus reduced the problem to that of testing for the equality of
the mean vectors in two multivariate populations. After this reduction, we
can use a test suggested by Wald and Wolfowitz (1944) (in the context
of testing for equality of two mean vectors) for our specific problem
of two group agreement.


If the distributions $Q_1$, $Q_2$ were multivariate normal with the same
covariance matrix, the appropriate test for equality of mean vectors
would be the normal theory test based on the Mahalanobis $D^2$-distance
between the two sample means. Clearly, here $Q_1$ and $Q_2$ are not
multivariate normal; we thus use the Wald-Wolfowitz (1944) conditionally
distribution-free test.

Notice that the covariance matrix of $(r_{i1}, \ldots, r_{ik})$ under any
distribution $Q$ on $\Omega$ will be singular, since $\sum_{j=1}^{k} r_{ij} = k(k+1)/2$. We will
therefore omit the ranking of the $k$th object and use only the rankings
of the first $(k-1)$ objects in computing the Mahalanobis distance. In
this we tacitly assume that the covariance matrix of $(r_{i1}, \ldots, r_{i(k-1)})$
under $Q$ is non-singular. Certain obvious modifications will have to
be made if this covariance is singular.

Let

$$ s_j = S_j/m, \qquad t_j = T_j/n, \qquad j = 1, \ldots, k-1, \tag{3.3} $$

and let

$$ c_{jj'} = \sum_{i=1}^{N} (r_{ij} - \bar{r}_j)(r_{ij'} - \bar{r}_{j'})/(N-1), \qquad 1 \le j, j' \le k-1, \tag{3.4} $$

where $\bar{r}_j = \sum_{i=1}^{N} r_{ij}/N$, $j = 1, \ldots, k-1$. Setting $s = (s_1, \ldots, s_{k-1})$,
$t = (t_1, \ldots, t_{k-1})$, and $C$ to be the $(k-1) \times (k-1)$ matrix of the
$c_{jj'}$'s, our proposed test rejects $H$ in favor of $A^*$ if

$$ B = \frac{mn}{N}\,(s - t)\,C^{-1}\,(s - t)' \tag{3.5} $$

is large. The statistic $B$ is monotonically related to the Mahalanobis
$D^2$-statistic. Since $B$ is not distribution-free under $H$, we turn to
a permutation test which is formally described as follows.

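Consider the group $\Pi$ of permutation transformations $\pi$ that apply
to a vector of $N$ rank vectors $(\omega_1, \ldots, \omega_N)$, $\omega_i = (\omega_{i1}, \ldots, \omega_{ik}) \in \Omega$,
$i = 1, \ldots, N$, as follows:

$$ \pi(\omega_1, \ldots, \omega_N) = (\omega_{\pi_1}, \ldots, \omega_{\pi_N}), $$

where $\pi = (\pi_1, \ldots, \pi_N)$ is a permutation of $(1, 2, \ldots, N)$. There
are $N!$ transformations in $\Pi$. Let $\Pi(\omega_1, \ldots, \omega_N)$ denote the orbit of
$(\omega_1, \ldots, \omega_N)$, that is

$$ \Pi(\omega_1, \ldots, \omega_N) = \{\pi(\omega_1, \ldots, \omega_N): \pi \in \Pi\}. $$

Under $H$,

$$ P\{(r_1, \ldots, r_N) = (r_1^*, \ldots, r_N^*) \mid (r_1, \ldots, r_N) \in \Pi(\omega_1, \ldots, \omega_N)\}
 = \begin{cases} (N!)^{-1} & \text{if } (r_1^*, \ldots, r_N^*) \in \Pi(\omega_1, \ldots, \omega_N), \\ 0 & \text{otherwise.} \end{cases} $$

Thus the conditional distribution of $(r_1, \ldots, r_N)$ given that it
belongs to the orbit of $(\omega_1, \ldots, \omega_N)$ is distribution-free under $H$.
This conditional distribution is called the permutation distribution.
Define the critical value $B_{\alpha}(\omega_1, \ldots, \omega_N)$ by the equation

$$ P_H\{B(r_1, \ldots, r_N) \ge B_{\alpha}(\omega_1, \ldots, \omega_N) \mid (r_1, \ldots, r_N) \in \Pi(\omega_1, \ldots, \omega_N)\} = \alpha. $$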
Our $\alpha$-level permutation test rejects $H$ if

$$ B(r_1, \ldots, r_N) \ge B_{\alpha}(r_1, \ldots, r_N). \tag{3.6} $$

Since the statistic $B$ is invariant under the $m!$ permutations of the
male rank vectors among themselves and invariant under the $n!$ permutations
of the female rank vectors among themselves, our proposed test requires
the calculation of $B$, not for each $\pi \in \Pi$, but only for the $M = \binom{N}{m}$ $\pi$'s
corresponding to the possible choices of $m$ rank vectors to serve as the
male rank vectors. Thus when $R = (r_1, \ldots, r_N)$ is observed, let
$b_1(R) \le \ldots \le b_M(R)$ denote the ordered values of $B(\pi(R))$ for these $M$
transformations. When $\alpha = d/M$, our test rejects $H$ in favor of $A^*$ if
$B(R)$ is one of the $d$ largest $b$ values.
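The enumeration over the $\binom{N}{m}$ choices can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' program: $B$ is taken to have the form $(mn/N)(s - t)C^{-1}(s - t)'$ with $C$ computed from all $N$ rank vectors (divisor $N - 1$) after dropping the $k$th object, the helper names are ours, and the two small groups of rankings are hypothetical.

```python
from itertools import combinations

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    q = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for c in range(q):
        p = max(range(c, q), key=lambda i: abs(M[i][c]))   # partial pivoting
        M[c], M[p] = M[p], M[c]
        for i in range(q):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[c])]
    return [M[i][q] / M[i][i] for i in range(q)]

def b_statistic(R, male_idx):
    """B for the split of the N rank vectors in R that treats male_idx as the male group."""
    N, q = len(R), len(R[0]) - 1                 # q = k - 1: the k-th object is dropped
    male = set(male_idx)
    m = len(male)
    n = N - m
    rbar = [sum(r[j] for r in R) / N for j in range(q)]
    C = [[sum((r[j] - rbar[j]) * (r[h] - rbar[h]) for r in R) / (N - 1)
          for h in range(q)] for j in range(q)]
    s = [sum(R[i][j] for i in male) / m for j in range(q)]
    t = [sum(R[i][j] for i in range(N) if i not in male) / n for j in range(q)]
    d = [a - b for a, b in zip(s, t)]
    return m * n / N * sum(a * b for a, b in zip(d, solve(C, d)))

def permutation_test(males, females):
    """Observed B and the proportion of splits whose B is at least as large."""
    R = list(males) + list(females)
    m = len(males)
    observed = b_statistic(R, range(m))
    values = [b_statistic(R, idx) for idx in combinations(range(len(R)), m)]
    p_value = sum(v >= observed - 1e-9 for v in values) / len(values)
    return observed, p_value

# Hypothetical rankings of k = 3 objects (illustrative only).
males = [(3, 1, 2), (3, 2, 1), (3, 1, 2), (2, 1, 3)]
females = [(1, 2, 3), (2, 3, 1), (1, 3, 2), (2, 3, 1)]
B_obs, p = permutation_test(males, females)
print(B_obs, p)
```

For clarity the sketch recomputes $C$ for every split; since $C$ does not change under permutation, a practical version would compute it once.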

Even though $C^{-1}$, appearing in (3.5), is unchanged by permutations,
the computations of the $\binom{N}{m}$ values of $B$, when $m$ and $n$ are large, are
formidable. In such cases, the following chi-square approximation
can be used.

Wald and Wolfowitz (1944) have shown that under $H$ the permutation
distribution of $B$ has a limiting chi-square distribution with $k-1$ degrees
of freedom, assuming that the covariance matrix of $(r_{i1}, \ldots, r_{i(k-1)})$
under $Q_1 (= Q_2)$ is non-singular. Thus the large sample approximation
to the $\alpha$ level test defined by (3.6) is: reject $H$ if

$$ B \ge \chi^2_{\alpha,k-1}, \tag{3.7} $$

where $\chi^2_{\alpha,k-1}$ is the upper $\alpha$ percentile point of a chi-square distribution
with $k-1$ degrees of freedom.
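When $k - 1$ is even, the chi-square upper tail has a closed form, so the approximate test needs no tables. The sketch below is restricted to that case (it is an assumption of the example, not of the test), and the function names are ours.

```python
from math import exp, factorial, log

def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-square variable X with an even number of degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("closed form used here requires a positive even df")
    h = x / 2
    return exp(-h) * sum(h ** i / factorial(i) for i in range(df // 2))

def chi2_upper_point_2df(alpha):
    """Upper alpha point of the chi-square distribution with 2 degrees of freedom."""
    return -2 * log(alpha)

# With k = 3 objects the test has k - 1 = 2 degrees of freedom.
crit_05 = chi2_upper_point_2df(0.05)
tail_at_crit = chi2_upper_tail_even_df(crit_05, 2)
print(crit_05, tail_at_crit)
```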

Consistency of the permutation test is established as follows.
When $(Q_1, Q_2) \in A^*$ and the covariance matrices of $(r_{i1}, \ldots, r_{i(k-1)})$
under $Q_1$ and $Q_2$ are non-singular, the results of Wald and Wolfowitz
(1944) show that in the permutation distribution, $(s - t)$ has a
limiting multivariate normal distribution with a mean vector $\mu - \nu$
[where here $\mu$, $\nu$ are the corresponding $k-1$ dimensional versions of
(2.13)] which is non-zero. Thus $B$ tends to $\infty$ and the permutation
test based on $B$ is consistent for all such alternatives in $A^*$.

4. AN EXAMPLE

Sutton (1976) has studied leisure preferences, and attitudes on
retirement, of the elderly with the aim of providing leisure programs
that meet the needs and goals of those participating. She cites evidence
that, in the United States, existing senior programs seem to be geared
to fitting clients to activities rather than planning activities with
the individual's needs and goals in mind. In a sample of elderly
retirees residing in Leon County, Florida, Sutton asked a number of
questions designed to determine preferences for selected "activity
components." Activity components are elements within activities such
as where the activity takes place, with whom the activity is done, and the
type of leadership preferred during the activity. The data in Table 1
are the responses of $m = 14$ white females and $n = 13$ black females,
in the age group 70-79 years, to the question: With which sex do you
... three responses: male(s), female(s), both sexes, scoring 1 for the
most desired or first choice and 3 for the least desired or third
choice.

Table 1. Preferred companions for leisure time
activities of elderly females
(data of C. Sutton)

                   male(s)   female(s)   both sexes

White Females         3          1            2
                      3          2            1
                      3          2            1
                      3          2            1
                      3          2            1
                     ...        ...          ...

Black Females         3          2            1
                      1          2            3
                      3          2            1
                      2          3            1
                      3          2            1
                      2          3            1
                      1          3            2
                      3          2            1
                      2          3            1
                      2          3            1
                      2          3            1
                      3          2            1
                      3          2            1

T's:                 30         32           16

For these data there is evidence that the white females have an
opinion and that the black females have an opinion. Friedman's (1937)
$\chi_r^2$ statistic (which, except for constants, is equivalent to the Kendall
and Babington Smith coefficient $W$) for white females is 18.4. Referring
this value to the chi-square distribution with two degrees of freedom
yields a P value less than .001 for the hypothesis of accordance among
white females. The corresponding values for the black females are
$\chi_r^2 = 11.7$, $P = .003$.
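These within-group values can be reproduced from the column rank sums alone, using the standard form of Friedman's statistic for untied rankings. The sketch below is illustrative: the black females' sums are the T's of Table 1, while the white females' sums (41, 20, 23) are an assumption, recovered as 14 times the reported mean ranks (2.929, 1.429) with the third sum obtained from the total 14 x 6 = 84. The function name is ours.

```python
from math import exp

def friedman_chi2(col_sums, judges):
    """Friedman's chi-square statistic from column rank sums (no ties)."""
    k = len(col_sums)
    return 12 * sum(c * c for c in col_sums) / (judges * k * (k + 1)) - 3 * judges * (k + 1)

T = (30, 32, 16)   # black females, column sums from Table 1
S = (41, 20, 23)   # white females, assumed: 14 x reported means, third sum by subtraction

chi_white = friedman_chi2(S, 14)
chi_black = friedman_chi2(T, 13)

# With k - 1 = 2 degrees of freedom the chi-square upper tail is exp(-x / 2).
p_white = exp(-chi_white / 2)
p_black = exp(-chi_black / 2)
print(round(chi_white, 1), round(chi_black, 1), p_white, p_black)
```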
We now use the conditionally distribution-free test to see whether
the white females and black females agree. From Table 1, (3.3), and
(3.4) we obtain

$$ (s_1, s_2) = (2.929, 1.429), \qquad (t_1, t_2) = (2.308, 2.462), $$

$$ C = \begin{bmatrix} .3960 & -.2593 \\ -.2593 & .5328 \end{bmatrix}, \qquad
   C^{-1} = \begin{bmatrix} 3.706 & 1.804 \\ 1.804 & 2.755 \end{bmatrix}, $$

and from (3.5),

$$ B = 13.8. $$
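The arithmetic leading to $B$ can be retraced from the reported summary values. The sketch below assumes that $B$ has the form $(mn/N)(s - t)C^{-1}(s - t)'$; the inputs are the rounded figures printed above, so the result agrees with the reported 13.8 only to within rounding.

```python
m, n = 14, 13
N = m + n

s = (2.929, 1.429)                       # white females, mean ranks of first two objects
t = (2.308, 2.462)                       # black females, mean ranks of first two objects
C = ((0.3960, -0.2593),
     (-0.2593, 0.5328))                  # reported covariance matrix, (3.4)

# Invert the 2 x 2 matrix C explicitly.
det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
C_inv = ((C[1][1] / det, -C[0][1] / det),
         (-C[1][0] / det, C[0][0] / det))

# Quadratic form in the difference of mean ranks, scaled by mn/N (assumed form of B).
d = (s[0] - t[0], s[1] - t[1])
quad = sum(d[i] * C_inv[i][j] * d[j] for i in range(2) for j in range(2))
B = m * n / N * quad
print(C_inv, B)
```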


There are $\binom{27}{14}$ = 20,058,300 possible ways to pick 14 of the 27 rank
vectors to serve as the rank vectors corresponding to the white females.
Of these, only 4178 choices yield $B$ values that are greater than or
equal to the observed value of $B = 13.8$. Thus the exact P value for
the conditional test is 4178/20,058,300 = .0002. This constitutes
very strong evidence that the white female retirees have a different
opinion than the black female retirees. The same conclusion is reached
using the chi-square approximation, to the conditional distribution of
$B$, given in Section 3. Referring $B = 13.8$ to the chi-square distribution
with two degrees of freedom yields an approximate P value of .001.
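The count of splits and the two P values quoted above follow from elementary arithmetic, as the short check below indicates. The count 4178 is taken from the text as given; it is not recomputed here.

```python
from math import comb, exp

M = comb(27, 14)                 # number of ways to choose the 14 "white female" rank vectors
exact_p = 4178 / M               # 4178 is the count reported in the text
approx_p = exp(-13.8 / 2)        # chi-square upper tail with 2 degrees of freedom
print(M, round(exact_p, 4), round(approx_p, 3))
```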

Quite the opposite erroneous conclusion is reached by referring
$L$ to its $H_{00}$ distribution as recommended by Schucany and Frawley (1973).
We find, from (1.2), (2.8), and (2.9),

$$ \frac{L - E_{00}(L)}{\{\mathrm{var}_{00}(L)\}^{1/2}} = \frac{2238 - 2184}{\{364\}^{1/2}} = 2.83. $$

The Schucany-Frawley normal deviate of 2.83 gives the incorrect impression
that the observed value of $L$ is extremely large, and according to the
SF test, this "large" value leads to the acceptance of $H_{11}$.
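The SF normal deviate can be retraced from the column rank sums. In the sketch below the black females' sums are the T's of Table 1; the white females' sums (41, 20, 23) are an assumption, recovered as 14 times the reported mean ranks with the third sum obtained from the total 84.

```python
from math import sqrt

m, n, k = 14, 13, 3
S = (41, 20, 23)   # white females, assumed: 14 x reported means, third sum by subtraction
T = (30, 32, 16)   # black females, column sums from Table 1

L = sum(a * b for a, b in zip(S, T))                       # (1.2)
E00 = m * n * k * (k + 1) ** 2 / 4                         # (2.8)
var00 = m * n * (k - 1) * k ** 2 * (k + 1) ** 2 / 144      # (2.9)
z = (L - E00) / sqrt(var00)
print(L, E00, var00, round(z, 2))
```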